Abstract

Models of dynamic networks (networks that evolve over time) have manifold applications. We develop a discrete-time generative model for social network evolution that inherits the richness and flexibility of the class of exponential-family random graph models. The model, a Separable Temporal ERGM (STERGM), facilitates separable modeling of the tie duration distributions and the structural dynamics of tie formation. We develop likelihood-based inference for the model, and provide computational algorithms for maximum likelihood estimation. We illustrate the interpretability of the model in analyzing a longitudinal network of friendship ties within a school.

1 Introduction

Relational phenomena occur in many fields and are increasingly being represented by networks. There is a need for realistic and tractable statistical models for these networks, especially when the phenomena evolve over time. For example, in epidemiology there is a need for data-driven modeling of human sexual relationship networks for the purpose of modeling and simulation of the spread of sexually transmitted disease. As Morris and Kretzschmar (1997) show, spread of such disease is affected not just by the momentary number of partnerships, but also by their timing. To that end, the models used must have realistic temporal structure as well as cross-sectional structure.



Holland and Leinhardt (1977), Frank (1991), and others describe continuous-time Markov models for evolution of social networks. (See Doreian and Stokman (1997) for a review.) The most popular parametrisation is the actor-oriented model described by Snijders (2005) and Snijders, van de Bunt, and Steglich (2010), which can be viewed in terms of actors making decisions to make and withdraw ties to other actors. This model was then extended by Snijders, Steglich, and Schweinberger (2007) to jointly model actors' network-related choices ("selection") and the effects of neighboring actors on each other's attributes ("influence").

Exponential-family random graph models (ERGMs) for social networks are a natural way to represent dependencies in cross-sectional graphs and dependencies between graphs over time, particularly in a discrete context. Robins and Pattison (2001) first described this approach. Hanneke and Xing (2007) and Hanneke, Fu, and Xing (2010) also define and describe a Temporal ERGM (TERGM) ("Discrete Temporal ERGM" in the 2007 publication), postulating an exponential family model for the transition probability from a network at time $t$ to a network at time $t + 1$.

Most of the attention in modeling of dynamic networks has focused on fitting the model to a network series (Snijders, 2001; Hanneke and Xing, 2007; Hanneke et al., 2010) or an enumeration of instantaneous events between actors in the network (Butts, 2008). In the former case, the dyad census of the network of interest is observed at multiple time points. In the latter case, each event of interest and its exact time of occurrence is observed.

A primary issue in modeling dynamic networks that has received limited attention is that of attribution of prevalence. A snapshot of a network at a single time point provides information about prevalence of the network properties of interest, such as the total number of ties, as opposed to properties of the dynamic network process that has produced it: incidence (the rate at which new ties are formed) and duration (how long they tend to last once they do). Multiple snapshots over the same set of actors (panel data) contain information about incidence and duration, but, as we show below, the model parametrisations presently in use do not allow convenient control over this attribution of prevalence.

In Section 2 we review discrete-time ERGM-based network models, and in Section 3 we extend these network models to provide a more interpretable and convenient parametrisation that separates incidence from duration. In Section 4 we develop conditional maximum likelihood estimators (CMLE) based on regularly-spaced network series data by extending the approach of Hunter and Handcock (2006). In Section 5 we illustrate the methodology with application to a longitudinal network of friendship ties within a school. In Section 6 we consider some extensions that the model framework suggests and allows.



2 Discrete-Time ERGM-Based Models for Network Evolution

We first consider a discrete-time dynamic network model in which the network at time $t$ is a single draw from an ERGM conditional on the network at time $t - 1$ (and possibly time $t - 2$, etc.), extending the Temporal ERGM (TERGM) of Hanneke and Xing (2007) and Hanneke et al. (2010). In this section we specify the model and discuss its fundamental properties.



2.1 Model Definition 

Suppose that $N$ is the set of $n = |N|$ actors of interest, labeled $1, \dots, n$, and let $\mathbb{Y} \subseteq N \times N$ be the set of potential ties among them (with pairs $(i,j) \in \mathbb{Y}$ ordered for directed and unordered for undirected networks), and let $\mathcal{Y} \subseteq 2^{\mathbb{Y}}$ be the set of possible networks of interest formed among these actors. For a network realization $y \in \mathcal{Y}$, define $y_{i,j}$ to be an indicator of a tie from actor $i$ to actor $j$, and further let $y_{i,\cdot}$ be the set of actors to whom $i$ has a tie, $y_{\cdot,j}$ the set of actors who have ties to $j$, and $y_i$ the set of actors with undirected ties with $i$. Let $Y^t \in \mathcal{Y}$ be a random variable representing the state of the network at the discrete time point $t$ and $y^t \in \mathcal{Y}$ be its realization.

Following Hunter and Handcock (2006), let $\theta \in \mathbb{R}^q$ be a vector of $q$ model parameters, and let $\eta(\theta) : \mathbb{R}^q \to \mathbb{R}^p$ be a mapping from $\theta$ to natural parameters $\eta \in \mathbb{R}^p$, with $q \le p$. Let $g : \mathcal{Y}^2 \to \mathbb{R}^p$ be the sufficient statistic for the transition from network $y^{t-1}$ at time $t - 1$ to network $y^t$ at time $t$. The one-step transition probability from $y^{t-1}$ to $y^t$ is then defined to be

$$\Pr(Y^t = y^t \mid Y^{t-1} = y^{t-1}; \theta) = \frac{\exp\left(\eta(\theta) \cdot g(y^t, y^{t-1})\right)}{c_{\eta,g}(\theta, y^{t-1})}, \quad y^t, y^{t-1} \in \mathcal{Y}, \tag{1}$$

or, with a $k$-order Markov assumption, and letting $g : \mathcal{Y}^{k+1} \to \mathbb{R}^p$,

$$\Pr(Y^t = y^t \mid Y^{t-1} = y^{t-1}, \dots, Y^{t-k} = y^{t-k}; \theta) = \frac{\exp\left(\eta(\theta) \cdot g(y^t, y^{t-1}, \dots, y^{t-k})\right)}{c_{\eta,g}(\theta, y^{t-1}, \dots, y^{t-k})}, \quad y^t, y^{t-1}, \dots, y^{t-k} \in \mathcal{Y}, \tag{2}$$

with

$$c_{\eta,g}(\theta, y^{t-1}, \dots, y^{t-k}) = \sum_{y' \in \mathcal{Y}} \exp\left(\eta(\theta) \cdot g(y', y^{t-1}, \dots, y^{t-k})\right)$$

being the normalizing constant.

TERGMs are a natural elaboration of the traditional ERGM framework. They are essentially stepwise ERGMs in time. Note that the definitions of Robins and Pattison (2001) and Hanneke and Xing (2007) used linear ERGMs only, where $\eta(\theta) = \theta$ and $p = q$. To simplify notation, from this point on we suppress reference to $\eta$ and $g$.



2.2 Model Specification and Interpretation 

The class of models specified by (1) is very broad and a key component of model specification is the selection of $g$. Natural candidates are those developed for cross-sectional networks, such as those enumerated by Morris, Handcock, and Hunter (2008). However, the choices in this dynamic situation are richer and can be any valid network statistics evaluated on $y^t$, especially those that depend on $y^{t-1}$. Hanneke and Xing (2007) focused on a choice of $g$ that had the property of conditional dyadic independence, that is,

$$\Pr(Y^t = y^t \mid Y^{t-1} = y^{t-1}; \theta) = \prod_{(i,j) \in \mathbb{Y}} \Pr(Y^t_{i,j} = y^t_{i,j} \mid Y^{t-1} = y^{t-1}; \theta), \tag{3}$$

the distribution of $Y^t$ in which tie states are independent, but only conditional on the whole of $Y^{t-1}$.
However, caution must be used in interpreting their parameters. Consider the simplest such statistic, the edge count:

$$g(y^t, y^{t-1}) = |y^t|.$$

A higher coefficient on $g$ will, for any $y^{t-1}$, produce a $Y^t$ distribution in which networks with more ties have higher probability. But note that this term would accomplish this in two ways simultaneously: it would both increase the weight of those networks in which more ties were formed on previously empty dyads and increase the weight of those networks in which more extant ties were preserved (fewer dissolved). That is, it would both increase the incidence and increase the duration.

Hanneke and Xing (2007) gave an example of a statistic that controls the rate of evolution of the network: a measure of stability. This statistic counts the number of tie variables whose states did not change between time steps, which is then divided by the maximum number of ties an actor could have (a constant):

$$g(y^t, y^{t-1}) = \frac{1}{n-1} \sum_{(i,j) \in \mathbb{Y}} \left( y^t_{i,j} y^{t-1}_{i,j} + (1 - y^t_{i,j})(1 - y^{t-1}_{i,j}) \right).$$

A higher coefficient on it will slow the evolution of the network down and a lower coefficient will speed it up. From the point of view of incidence and duration, however, it will do so in two ways: a higher coefficient will result in networks that have fewer new ties formed and fewer extant ties dissolved, so incidence will be decreased and duration will be increased.

The two-sided nature of these effects tends to muddle parameter interpretation, but a more substantial issue arises if selective mixing statistics, like those described by Koehly, Goodreau, and Morris (2004), are used. Consider a concrete example, with actors partitioned into $K$ known groups, with $\mathbb{K} \subseteq \{1, \dots, K\}^2$ being the set of pairs of groups between whose actors there may be ties. (For example, in a directed network, $\mathbb{K} = \{1, \dots, K\}^2$.) Let $P_k$ be the set of actors who belong to group $k$ and $P(i)$ be the partition to which actor $i$ belongs. The model with transition probability

$$\Pr(Y^t = y^t \mid Y^{t-1} = y^{t-1}; \theta) \propto \exp\left( \theta_0 \sum_{(i,j) \in \mathbb{Y}} \left( y^t_{i,j} y^{t-1}_{i,j} + (1 - y^t_{i,j})(1 - y^{t-1}_{i,j}) \right) + \sum_{(k_1,k_2) \in \mathbb{K}} \theta_{k_1,k_2} \left| y^t_{P_{k_1},P_{k_2}} \right| \right) \tag{4}$$

models stability, controlled by $\theta_0$, and mixing among the groups, controlled by $\theta_{k_1,k_2}$. (Here, $|y_{P_{k_1},P_{k_2}}|$ is defined as the number of ties from actors in group $k_1$ to actors in group $k_2$ for directed networks, and ties between actors in those groups for undirected networks.)

Given $y^{t-1}$, the probability that a given non-tied directed pair $(i,j)$ will gain a tie in a given time step is

$$\Pr(Y^t_{i,j} = 1 \mid Y^{t-1}_{i,j} = 0; \theta) = \operatorname{logit}^{-1}\left(-\theta_0 + \theta_{P(i),P(j)}\right),$$

and the probability that an extant tie will be removed is

$$\Pr(Y^t_{i,j} = 0 \mid Y^{t-1}_{i,j} = 1; \theta) = \operatorname{logit}^{-1}\left(-\theta_0 - \theta_{P(i),P(j)}\right),$$

the latter leading to a duration distribution which is geometric with support $\mathbb{N}$ and expected value (Casella and Berger, 2002, pp. 621-622)

$$\left( \operatorname{logit}^{-1}\left(-\theta_0 - \theta_{P(i),P(j)}\right) \right)^{-1} = 1 + \exp\left(\theta_0 + \theta_{P(i),P(j)}\right).$$

Thus, a higher value of coefficient $\theta_{k_1,k_2}$ simultaneously increases the incidence of ties between actors in group $k_1$ and actors in group $k_2$ and their duration.
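
As a numerical illustration of this coupling, the following minimal Python sketch evaluates the formation probability, the dissolution probability, and the expected duration for a single dyad under the stability-plus-mixing model. The function names and the parameter values are hypothetical and chosen only for illustration; they do not come from any fitted model.

```python
import math

def inv_logit(x):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))

def formation_prob(theta0, theta_mix):
    """Probability that an empty dyad gains a tie in one time step."""
    return inv_logit(-theta0 + theta_mix)

def dissolution_prob(theta0, theta_mix):
    """Probability that an extant tie is removed in one time step."""
    return inv_logit(-theta0 - theta_mix)

def expected_duration(theta0, theta_mix):
    """Mean of the geometric duration distribution: 1 + exp(theta0 + theta_mix)."""
    return 1.0 + math.exp(theta0 + theta_mix)

# Hypothetical values: raising the mixing coefficient raises BOTH the
# formation probability and the expected duration.
theta0 = 2.0
for theta_mix in (-1.0, 0.0, 1.0):
    print(theta_mix,
          formation_prob(theta0, theta_mix),
          expected_duration(theta0, theta_mix))
```

Both printed columns increase with the mixing coefficient, which is the coupling described above: a single parameter cannot raise incidence without also lengthening duration.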

This coupling between the incidence of ties and their duration not only makes such terms problematic to interpret, but has a direct impact on modeling. Consider a sexual partnership network, possessing strong ethnic homophily, with ties within each ethnic category being more prevalent (relative to the potential number of ties) than ties between ethnic categories. (A real-world illustration of this effect was given by Krivitsky, Handcock, and Morris (2011).) This structure could be a consequence of the within-ethnic ties being formed more frequently than between-ethnic ties, of the within-ethnic ties lasting, on average, longer than between-ethnic ties, or some combination of the two. With cross-sectional data alone, it is impossible to tell these apart, and a model like (4) implies a dynamic process in which cross-ethnic ties toggle unnaturally frequently, or "churn". We refer to a model with this dynamic pathology as a "churning model", as this stochastic property is unlikely to be seen in real phenomena. Churning is related to the degeneracy properties of ERGMs (Handcock, 2003).



3 Separable Parametrisation 

We now motivate and describe the concept of separability of formation and dissolution in a dynamic network model, and describe the Separable Temporal ERGM (STERGM).



3.1 Motivation 

Intuitively, those social processes and factors that result in ties being formed are not the same as those that result in ties being dissolved. For example, in the above-mentioned sexual partnership network, the relative lack of cross-ethnic ties may be a result of racial segregation, language and cultural barriers, racism, and population-level differences in socioeconomic status, all of which have a strong effect on the chances of a relationship forming. Once an interracial relationship has been formed, however, either because these factors did not apply in that case or because they were overcome, the duration of such a relationship would likely not be substantially lower. Even if it were lower, the differences in the probability of such a relationship ending during a particular time interval would not, in general, be a perfect reflection of the differences in the probability of it forming during such a time interval.

Furthermore, it is often the case in practice that information about cross-sectional properties of a network (i.e., prevalence) has a different source from that of the information about its longitudinal properties (i.e., duration), and it may be useful to be able to consider them separately (Krivitsky and Handcock, 2008; Krivitsky, 2009).

Thus, it is useful for the parametrisation of a model to allow separate control over incidence and duration of ties and separate interpretation, at least over the short run. (For any nontrivial process, formation and dissolution would likely interact with each other in the long run.)



3.2 Model Specification 

In this section, we introduce a class of discrete-time models for network evolution, which assumes that the formation and dissolution processes are separable from each other within a time step. We consider a sub-class of models based on the ERGM family, which inherits the interpretability and flexibility of that family.



3.2.1 General Separable Models 

We represent networks as sets of ties, so given $y, y' \in \mathcal{Y}$, the network $y \cup y'$ has the tie $(i,j)$ if, and only if, $(i,j)$ exists in $y$ or $y'$ or both; the network $y \cap y'$ has $(i,j)$ if, and only if, $(i,j)$ exists in both $y$ and $y'$; and the network $y \setminus y'$ has tie $(i,j)$ if, and only if, $(i,j)$ exists in $y$ but not in $y'$. The relation $y \supseteq y'$ holds if, and only if, $y$ has all of the ties that $y'$ does (and, possibly, other ties as well), and conversely for $y \subseteq y'$.

Table 1: Possible transitions of a single tie variable

$y^{t-1}_{i,j}$  →  $(y^+_{i,j}, y^-_{i,j})$  →  $y^t_{i,j}$
0  →  (0,0)  →  0
0  →  (1,0)  →  1
1  →  (1,0)  →  0
1  →  (1,1)  →  1

Consider the evolution of a random network at time $t - 1$ to time $t$, and define two intermediate networks: the formation network $Y^+$, consisting of the initial network $Y^{t-1}$ with ties formed during the time step added, and the dissolution network $Y^-$, consisting of the initial network $Y^{t-1}$ with ties dissolved during the time step removed (with $y^+$ and $y^-$ being their respective realized counterparts). Then, given $y^{t-1}$, $y^+$, and $y^-$, the network $y^t$ may be evaluated via a set operation, as

$$y^t = y^+ \setminus (y^{t-1} \setminus y^-) = y^- \cup (y^+ \setminus y^{t-1}). \tag{5}$$

Since it is the networks $y^{t-1}$ and $y^t$ that are actually observed, $y^+$ and $y^-$ may be regarded as latent variables, but it is possible to recover them given $y^{t-1}$ and $y^t$, because a tie variable can only be in one of four states given in Table 1. Each possibility has a unique combination of tie variable states in $y^{t-1}$ and $y^t$, so observing the network at the beginning and the end allows the two intermediate states to be determined as $y^+ = y^{t-1} \cup y^t$ and $y^- = y^{t-1} \cap y^t$.
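
The set operations above can be sketched directly with Python sets. This is only an illustration: networks are represented as sets of (i, j) tuples, and the function names and the example networks are hypothetical.

```python
def intermediate_networks(y_prev, y_curr):
    """Recover the formation and dissolution networks from two observed networks."""
    y_plus = y_prev | y_curr    # formation network: union of the two networks
    y_minus = y_prev & y_curr   # dissolution network: intersection of the two
    return y_plus, y_minus

def next_network(y_prev, y_plus, y_minus):
    """Combine the intermediate networks: remove from y_plus the ties that dissolved."""
    dissolved = y_prev - y_minus
    return y_plus - dissolved

# Hypothetical example: tie (1, 2) dissolves, (2, 3) persists, (3, 1) forms.
y_prev = {(1, 2), (2, 3)}
y_curr = {(2, 3), (3, 1)}
y_plus, y_minus = intermediate_networks(y_prev, y_curr)
rebuilt = next_network(y_prev, y_plus, y_minus)
```

Here `rebuilt` equals `y_curr`, and the alternative form `y_minus | (y_plus - y_prev)` gives the same network.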

If $Y^+$ is conditionally independent of $Y^-$ given $Y^{t-1}$, then

$$\Pr(Y^t = y^t \mid Y^{t-1} = y^{t-1}; \theta) = \Pr(Y^+ = y^+ \mid Y^{t-1} = y^{t-1}; \theta) \times \Pr(Y^- = y^- \mid Y^{t-1} = y^{t-1}; \theta). \tag{6}$$

We refer to the two factors on the right-hand side as the formation model and the dissolution model, respectively. Suppose that we can express $\theta = (\theta^+, \theta^-)$ where the formation model is parametrised by $\theta^+$ and the dissolution model by $\theta^-$.

Definition. We say that a dynamic model is separable if $Y^+$ is conditionally independent of $Y^-$ given $Y^{t-1}$ and the parameter space of $\theta$ is the product of the individual parameter spaces of $\theta^+$ and $\theta^-$.

We refer to such a model as separable because it represents an assumption 
that during a given discrete time step, the process by which the ties form 
does not interact with the process by which they dissolve: both are separated 
(in the conditional independence sense) from each other conditional on the 
state of the network at the beginning of the time step. 



3.2.2 Generative Mechanism 

Let some $\mathcal{Y}^+(y^{t-1}) \subseteq \{y \in 2^{\mathbb{Y}} : y \supseteq y^{t-1}\}$ be the sample space, under the model, of formation networks, starting from $y^{t-1}$; and let some $\mathcal{Y}^-(y^{t-1}) \subseteq \{y \in 2^{\mathbb{Y}} : y \subseteq y^{t-1}\}$ be the sample space of dissolution networks. The model postulates the following process for evolution of a random network at time $t - 1$ to a random network at time $t$:

1. Draw an intermediate network $y^+$ from the distribution

   $$\Pr(Y^+ = y^+ \mid Y^{t-1} = y^{t-1}; \theta^+), \quad y^+ \in \mathcal{Y}^+(y^{t-1}).$$

2. Draw an intermediate network $y^-$ from the distribution

   $$\Pr(Y^- = y^- \mid Y^{t-1} = y^{t-1}; \theta^-), \quad y^- \in \mathcal{Y}^-(y^{t-1}).$$

3. Apply formations and dissolutions to $y^{t-1}$ to produce $y^t$ by evaluating (5).

Note that, as specified, this model is first order Markov, but $Y^t$ can be further conditioned on $Y^{t-2}$, $Y^{t-3}$, etc., to produce higher order versions. We do not develop these models here.
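
The three steps can be sketched in Python for the simplest special case, in which each phase is dyad-independent with a constant probability. That simplification is an assumption made only to keep the example short (the general model allows arbitrary distributions for each phase), and the names `simulate_step`, `p_form`, and `p_keep` are hypothetical.

```python
import random

def simulate_step(y_prev, dyads, p_form, p_keep, rng):
    """One time step of a separable model with dyad-independent phases.

    y_prev : set of ties at time t-1
    dyads  : list of all potential ties
    p_form : probability that an empty dyad gains a tie (formation phase)
    p_keep : probability that an extant tie is preserved (dissolution phase)
    """
    # Step 1: the formation network contains y_prev plus newly formed ties.
    y_plus = set(y_prev)
    for d in dyads:
        if d not in y_prev and rng.random() < p_form:
            y_plus.add(d)
    # Step 2: the dissolution network contains only the preserved ties of y_prev.
    y_minus = {d for d in y_prev if rng.random() < p_keep}
    # Step 3: remove the dissolved ties from the formation network.
    y_next = y_plus - (y_prev - y_minus)
    return y_plus, y_minus, y_next

rng = random.Random(0)
n = 5
dyads = [(i, j) for i in range(n) for j in range(n) if i != j]
y0 = {(0, 1), (1, 2), (3, 4)}
y_plus, y_minus, y1 = simulate_step(y0, dyads, p_form=0.1, p_keep=0.8, rng=rng)
```

Because the two phases draw on `y0` only, neither draw depends on the outcome of the other, which is the conditional independence the definition of separability requires.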



3.2.3 Separable Temporal ERGM (STERGM) 

A natural family of models for the components of the separable model is the ERGMs considered in Section 2.1. We focus on this rich class of models in the remainder of the paper. Specifically, we model:

$$\Pr(Y^+ = y^+ \mid Y^{t-1} = y^{t-1}; \theta^+) = \frac{\exp\left(\eta^+(\theta^+) \cdot g^+(y^+, y^{t-1})\right)}{c_{\eta^+,g^+}(\theta^+, y^{t-1})}, \quad y^+ \in \mathcal{Y}^+(y^{t-1}),$$

and

$$\Pr(Y^- = y^- \mid Y^{t-1} = y^{t-1}; \theta^-) = \frac{\exp\left(\eta^-(\theta^-) \cdot g^-(y^-, y^{t-1})\right)}{c_{\eta^-,g^-}(\theta^-, y^{t-1})}, \quad y^- \in \mathcal{Y}^-(y^{t-1}),$$

with their normalizing constants $c_{\eta^+,g^+}(\theta^+, y^{t-1})$ and $c_{\eta^-,g^-}(\theta^-, y^{t-1})$ summing over $\mathcal{Y}^+(y^{t-1})$ and $\mathcal{Y}^-(y^{t-1})$, respectively.

We now derive the probability of transitioning from a given network at time $t - 1$, $y^{t-1}$, to a given network at time $t$, $y^t$. Based on (6), we have

$$\Pr(Y^t = y^t \mid Y^{t-1} = y^{t-1}; \theta) = \frac{\exp\left(\eta(\theta) \cdot g(y^t, y^{t-1})\right)}{c_{\eta^+,g^+}(\theta^+, y^{t-1}) \, c_{\eta^-,g^-}(\theta^-, y^{t-1})},$$

where $\eta = (\eta^+, \eta^-)$ and $g(y^t, y^{t-1}) = \left(g^+(y^{t-1} \cup y^t, y^{t-1}), g^-(y^{t-1} \cap y^t, y^{t-1})\right)$. As $\Pr(Y^t = y^t \mid Y^{t-1} = y^{t-1}; \theta)$ is, by construction, a valid probability mass function,

$$c_{\eta^+,g^+}(\theta^+, y^{t-1}) \, c_{\eta^-,g^-}(\theta^-, y^{t-1}) = c_{\eta,g}(\theta, y^{t-1}),$$

where

$$c_{\eta,g}(\theta, y^{t-1}) = \sum_{y' \in \mathcal{Y}} \exp\left(\eta(\theta) \cdot g(y', y^{t-1})\right).$$

This is the same form as (1). Thus, the STERGM class is a subclass of a first order Markov TERGM of Hanneke and Xing (2007), described in Section 2: any transition process that can be expressed with $g^+$, $g^-$, $\eta^+$ and $\eta^-$ can be reproduced by a model in the TERGM class. However, the essential
issue is the specification of models within these classes, and the value of the 
STERGM class is that it focuses specification on a viable and fecund region 
in the very broad class. In the parametrisation in terms of formation and 
dissolution, some flexibility is lost — the ability to have the formation and 
dissolution processes interact within a given time step. What is gained is 
ease of specification, tractability of the model, and substantial improvement 
in interpretability. 
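
The equality between the product of the two phase-specific normalizing constants and the normalizing constant of the combined TERGM can be checked by brute force on a very small network. The sketch below is illustrative only: it assumes a directed network on three actors, a linear parametrisation, the unconstrained sample spaces (all supersets and all subsets of the previous network), and hypothetical statistics (edge count and mutual-pair count) and coefficients.

```python
import itertools
import math

n = 3
dyads = [(i, j) for i in range(n) for j in range(n) if i != j]

def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in itertools.combinations(items, r):
            yield frozenset(combo)

def stats(y):
    """Hypothetical statistics: (edge count, number of mutual pairs)."""
    mutual = sum(1 for (i, j) in y if i < j and (j, i) in y)
    return (len(y), mutual)

def weight(theta, y):
    """Unnormalized probability exp(theta . stats(y))."""
    return math.exp(sum(t * s for t, s in zip(theta, stats(y))))

theta_plus = (-1.0, 0.5)    # hypothetical formation coefficients
theta_minus = (1.0, 0.3)    # hypothetical dissolution coefficients
y_prev = frozenset({(0, 1), (1, 0), (1, 2)})
empty = [d for d in dyads if d not in y_prev]

# Formation constant: sum over all supersets of y_prev.
c_plus = sum(weight(theta_plus, y_prev | extra) for extra in subsets(empty))
# Dissolution constant: sum over all subsets of y_prev.
c_minus = sum(weight(theta_minus, kept) for kept in subsets(y_prev))
# Combined constant: sum over all networks y of the joint weight.
c_joint = sum(weight(theta_plus, y_prev | y) * weight(theta_minus, y_prev & y)
              for y in subsets(dyads))

print(c_plus * c_minus, c_joint)
```

The two printed numbers agree up to floating-point error because each network at time $t$ corresponds to exactly one pair of formation and dissolution networks.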



3.3 Interpretation 



In contrast to statistics like stability in Section 2.2, the STERGM's sufficient statistics and parameters have an implicit direction: they affect directly either incidence or duration, but not both, and even statistics that do not explicitly incorporate the previous time step's network $y^{t-1}$ incorporate it via the constraint of the phase in which they are used. This allows familiar cross-sectional ERGM sufficient statistics to be used, with their parameters acquiring intuitive interpretations in terms of the network evolution process. We call these inherited terms, for which $g^+_k(y^+, y^{t-1}) = g^+_k(y^+)$ and/or $g^-_k(y^-, y^{t-1}) = g^-_k(y^-)$, with no further dependence on $y^{t-1}$, implicitly dynamic.

Such terms (and their corresponding coefficients) often have straightforward general interpretations for formation and dissolution phases. In particular, consider an implicitly dynamic statistic that counts the number of instances of a particular feature found in the network $y^+$ or $y^-$. Examples of features that might be counted include a tie, an actor with exactly $d$ neighbors, or a tie between an actor in a set $P_{k_1}$ and an actor in the set $P_{k_2}$.

3.3.1 Formation 

A positive $\theta^+_k$ corresponding to a particular $g^+_k$ increases the probability of those $y^+$ which have more instances of the feature counted by $g^+_k$, that is, greater values of $g^+_k(y^+)$. This affects the network process in two ways: the probability of forming those ties that create new instances of the feature counted by $g^+_k$ is increased and the probability of forming those ties that "disrupt" those instances would be reduced.

Conversely, a negative $\theta^+_k$ would result in higher probabilities for those networks with fewer instances of the feature counted by $g^+_k$, reducing the probability of forming ties to create more instances of the feature counted and increasing the probability of forming ties to "disrupt" the feature.

Notably, $g^+_k$ counts features in the network $y^+ = y^t \cup y^{t-1}$, rather than in the ultimately observed network $y^t$. This means that for some features, particularly those with dyadic dependence, the dissolution process may influence the feature so that it is present in $y^+$ but not in $y^{t-1}$ or $y^t$. How frequently this occurs depends on the specific model and the rate of evolution of the network process: if a network process is such that the network changes little (in both formation and dissolution) during each time step, such interference is unlikely.

3.3.2 Dissolution 

As in the formation phase, a positive $\theta^-_k$ corresponding to a particular $g^-_k$ increases the probability of those $y^-$ which have more instances of the feature counted by $g^-_k$, thus tending to preserve more instances of that feature (or dissolving ties to create more instances, as may be the case with dyadic-dependent terms), while a negative $\theta^-_k$ will increase the probability of networks with fewer instances of the feature in question, effectively causing the dissolution process to target those features, and also refrain from dissolving ties whose dissolution would create those features. It is important to note that the dissolution phase ERGM determines which ties are preserved during the time step, and the parameters should be interpreted accordingly.

Again, it is $y^- = y^t \cap y^{t-1}$ on which statistics are evaluated, so the formation process can interact with the dissolution process as well.

These principles mean that many of the vast array of network statistics developed for ERGMs (Morris et al., 2008, for example) can be readily adapted to STERGM modeling, retaining much of their interpretation. In the Appendix, we develop and give interpretations to the fundamental edge count, selective mixing by actor attribute, and degree distribution terms.



3.3.3 Explicitly Dynamic Terms 

At the same time, some effects on formation and dissolution may depend on specific features of $y^{t-1}$. For instance, consider a social process in which an actor having multiple partners (e.g., "two-timing") is actively punished, so having more than one partner in $y^{t-1}$ increases the hazard of losing all of one's partners in $y^t$. (Such an effect may be salient in a sexual partnership network.) This dissolution effect cannot be modeled by implicitly dynamic terms, because it cannot be reduced to merely increasing or reducing the tendency of $Y^-$ to have particular features. For example, a positive coefficient on a statistic counting the number of actors with no partners (isolates) would increase the weight of those $y^-$ that have more isolates, affecting the dissolution of the sole tie of an actor with only one partner just as much as it would affect the dissolution of ties of an actor with more than one partner.

On the other hand, an explicitly dynamic model term that counts the number of actors with no partners in $y^-$ only among those actors who had two or more partners in $y^{t-1}$ would, with a positive coefficient, increase the probability of a transition directly from having two partners to having none. Beyond that, its interpretation would be no different than that of an implicitly dynamic dissolution term.


3.4 Continuous-Time Markov Models

Although the focus of this paper is on discrete-time models for network evolution, the separability paradigm can be applied to continuous-time network evolution models such as those of Holland and Leinhardt (1977). There, network evolution is modeled as a continuous-time Markov process such that the intensity of transition between two networks that differ by more than one dyad is 0, while the evolution of the network is controlled by intensities $\lambda(y^t; \theta)$ defined on $\mathcal{Y}$, with each $\lambda_{i,j}(y^t; \theta)$ being the intensity associated with toggling each dyad $(i,j)$.

In that scenario, separation of formation and dissolution is realized by formulating $\theta = (\theta^+, \theta^-)$ and

$$\lambda_{i,j}(y^t; \theta^+, \theta^-) = \begin{cases} \lambda^+_{i,j}(y^t; \theta^+) & \text{if } y^t_{i,j} = 0 \\ \lambda^-_{i,j}(y^t; \theta^-) & \text{if } y^t_{i,j} = 1 \end{cases}$$

with $\lambda^+_{i,j}(y^t; \theta^+)$ and $\lambda^-_{i,j}(y^t; \theta^-)$ being formation- and dissolution-specific intensities. Indeed, Holland and Leinhardt (1977) use a formulation of this general sort. Notably, unlike the discrete-time process, this separation requires only separation of parameters and no additional independence assumptions. This is because under the Markov assumption and with no chance of more than one dyad toggling coincidentally at a specific time, dyads effectively evolve independently in a sufficiently small interval (i.e., $[t, t + h]$, $h \to 0$), and dyadic independence in network evolution a fortiori implies separability between which ties form and which ties dissolve.

An exponential-family form for $\lambda_{i,j}$,

$$\lambda_{\eta,g,(i,j)}(y^t; \theta) = \begin{cases} \exp\left(\eta^+(\theta^+) \cdot g^+(y^t \cup \{(i,j)\}, y^t)\right) & \text{if } y^t_{i,j} = 0 \\ \exp\left(\eta^-(\theta^-) \cdot g^-(y^t \setminus \{(i,j)\}, y^t)\right) & \text{if } y^t_{i,j} = 1 \end{cases}$$

may be viewed as the limiting case of the discrete-time STERGM, in which the amount of time represented by each time step shrinks to zero.
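
The exponential-family intensity can be evaluated mechanically, as in the following Python sketch. It assumes a linear parametrisation and uses a hypothetical edge-count statistic and hypothetical coefficients; the function names are illustrative and not part of any package.

```python
import math

def toggle_intensity(y, dyad, theta_plus, theta_minus, g_plus, g_minus):
    """Intensity of toggling `dyad` given the current network y (a set of ties)."""
    if dyad not in y:
        # Formation: evaluate g_plus on the network with the tie added.
        values = g_plus(y | {dyad}, y)
        return math.exp(sum(t * v for t, v in zip(theta_plus, values)))
    # Dissolution: evaluate g_minus on the network with the tie removed.
    values = g_minus(y - {dyad}, y)
    return math.exp(sum(t * v for t, v in zip(theta_minus, values)))

def edge_count(y_new, y_old):
    """Hypothetical implicitly dynamic statistic: number of ties in y_new."""
    return (len(y_new),)

y = {(0, 1), (1, 2)}
rate_form = toggle_intensity(y, (2, 0), (-3.0,), (1.0,), edge_count, edge_count)
rate_diss = toggle_intensity(y, (0, 1), (-3.0,), (1.0,), edge_count, edge_count)
```

The formation intensity depends only on the formation coefficients and the dissolution intensity only on the dissolution coefficients, which is the separation of parameters described above.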



4 Likelihood-Based Inference for TERGMs 



In this section, we consider inference based on observing a series of $T + 1$ networks, $y^0, \dots, y^T$. Hanneke and Xing (2007) proposed to fit TERGMs by finding the conditional MLE under an order $k$ Markov assumption

$$\hat{\theta} = \arg\max_{\theta} \prod_{t=k}^{T} \Pr(Y^t = y^t \mid Y^{t-k} = y^{t-k}, \dots, Y^{t-1} = y^{t-1}; \theta), \tag{7}$$

computing a Method-of-Moments estimator (equivalent in their case to the MLE) with a simulated Newton-Raphson zero-finding algorithm. We extend the work of Hunter and Handcock (2006) and Geyer and Thompson (1992) to compute the conditional MLE for curved exponential-family transition models (that is, cases where $\eta(\theta) \ne \theta$).

For simplicity, we consider models with first-order Markov dependence. There is no loss of generality, since as long as the order of Markov dependence $k$ is finite, we can define the depended-upon network $y^{t-1}$ to implicitly "store" whatever information about $y^{t-1}, \dots, y^{t-k+1}$ is needed to compute the transition probability.

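The conditional MLE $\hat{\theta}$ can then be obtained by maximizing the log-likelihood

$$\ell(\theta) = \eta(\theta) \cdot \left( \sum_{t=1}^{T} g(y^t, y^{t-1}) \right) - \log\left( \prod_{t=1}^{T} c_{\eta,g}(\theta, y^{t-1}) \right).$$

For any two values of the model parameter $\theta^0$ and $\theta$, the log-likelihood ratio is

$$\ell(\theta) - \ell(\theta^0) = \left(\eta(\theta) - \eta(\theta^0)\right) \cdot \left( \sum_{t=1}^{T} g(y^t, y^{t-1}) \right) - \log\left( \prod_{t=1}^{T} \frac{c_{\eta,g}(\theta, y^{t-1})}{c_{\eta,g}(\theta^0, y^{t-1})} \right).$$

The main difficulty is in evaluating the ratio of the normalizing constants. These conditional normalizing constants depend on networks at times $0, \dots, T - 1$. However, these ratios can still be expressed as

$$\begin{aligned} \prod_{t=1}^{T} \frac{c_{\eta,g}(\theta, y^{t-1})}{c_{\eta,g}(\theta^0, y^{t-1})} &= \prod_{t=1}^{T} \sum_{y \in \mathcal{Y}} \exp\left(\left(\eta(\theta) - \eta(\theta^0)\right) \cdot g(y, y^{t-1})\right) \times \frac{\exp\left(\eta(\theta^0) \cdot g(y, y^{t-1})\right)}{c_{\eta,g}(\theta^0, y^{t-1})} \\ &= \prod_{t=1}^{T} \sum_{y \in \mathcal{Y}} \exp\left(\left(\eta(\theta) - \eta(\theta^0)\right) \cdot g(y, y^{t-1})\right) \times \Pr(Y^t = y \mid Y^{t-1} = y^{t-1}; \theta^0). \end{aligned} \tag{8}$$

The expression (8) is a product of expectations over the conditional distribution under the model of $Y^t$ given $Y^{t-1}$ at $\theta^0$, each of which can be estimated by simulation, allowing the algorithm of Hunter and Handcock (2006) to be applied to fit a TERGM to network series data.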
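
The identity behind (8) can be illustrated for a single time step on a network small enough to enumerate. The Python sketch below is not the fitting algorithm itself: it assumes a linear parametrisation, a hypothetical two-component transition statistic, and hypothetical parameter values, and it draws from the transition distribution exactly by enumeration, where a real application would use MCMC.

```python
import itertools
import math
import random

n = 3
dyads = [(i, j) for i in range(n) for j in range(n) if i != j]
networks = [frozenset(c) for r in range(len(dyads) + 1)
            for c in itertools.combinations(dyads, r)]

def g(y, y_prev):
    """Hypothetical transition statistic: (edge count, number of unchanged dyads)."""
    stable = sum(1 for d in dyads if (d in y) == (d in y_prev))
    return (len(y), stable)

def dot(a, b):
    return sum(x * z for x, z in zip(a, b))

def normalizing_constant(theta, y_prev):
    """Sum of exp(theta . g) over all networks (feasible only for tiny n)."""
    return sum(math.exp(dot(theta, g(y, y_prev))) for y in networks)

y_prev = frozenset({(0, 1), (1, 2)})
theta0 = (-0.5, 0.8)   # hypothetical reference parameter
theta = (-0.3, 1.0)    # hypothetical parameter of interest

# Exact ratio of normalizing constants, by enumeration.
exact_ratio = normalizing_constant(theta, y_prev) / normalizing_constant(theta0, y_prev)

# Simulation estimate: average exp((theta - theta0) . g) over draws at theta0.
weights0 = [math.exp(dot(theta0, g(y, y_prev))) for y in networks]
rng = random.Random(1)
draws = rng.choices(networks, weights=weights0, k=20000)
delta = tuple(a - b for a, b in zip(theta, theta0))
mc_ratio = sum(math.exp(dot(delta, g(y, y_prev))) for y in draws) / len(draws)

print(exact_ratio, mc_ratio)
```

For a series of networks, one such expectation would be estimated for each observed $y^{t-1}$ and the estimates multiplied, as in (8).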

These results also make it possible to assess the goodness-of-fit of a model via an analysis of deviance. Specifically, we can compute the change in log-likelihood from the null model ($\eta(\theta) = 0$) to the conditional MLE. To do this, we extended the bridge sampler of Hunter and Handcock (2006) to this setting.



5 Application to the Dynamics of Friendship 



As an application of this model, we consider the friendship relations among 26 students during their first year at a Dutch secondary school (Knecht, 2008). The friendship nominations were assessed at four time points at intervals of three months starting at the beginning of their secondary schooling. The friendship data are directed and were assessed by asking students to indicate classmates whom they considered good friends. There were 17 girls and 9 boys in the class. The data included covariates collected on each student. Here, we consider the sex of the student, as it is a primary determinant of the friendship ties. We also consider a dyadic covariate indicating if each pair of students had gone to the same primary school. These data were used to illustrate the actor-oriented approach to modeling by Snijders et al. (2010) (whom we follow). That paper should be consulted for details of the data set and an alternative analysis.

Some of the data at time points two through four were missing due to student absence when the survey was taken. These were accommodated using the approach of Handcock and Gile (2010) under the assumption that the unobserved data pattern was amenable to the model. One student left the class after time point 1. This could have been accommodated in a number of ways, depending on the assumptions one is willing to make. Here we considered the networks with this student omitted both as a nominator and nominee of friendships. As Snijders et al. (2010) note, each student was allowed to nominate at most 12 classmates at each time point. In general, inference needs to incorporate features of sampling design such as this one. We discuss how in Section 6. However, its effect here is negligible: in the ($4 \times 25 =$) 100 student reports, only 3 nominated the maximum number.

Our objective is to explain the observed structural patterns of change in the network over the course of the year. We build a model including both exogenous and endogenous structural effects, following the same approach and motivations as Snijders et al. (2010). For the formation component we include terms for the propensity of students to choose friends of the same or opposite sex (i.e., overall propensities to nominate friends that are homophilous on sex or not). We include a term to measure the propensity of friendships to be reciprocal. We include information on the primary school co-attendance via a count of the number of times students nominate other students with whom they went to primary school. To capture any overall propensity of students to nominate other students who are popular we include an overall outdegree popularity effect (Snijders et al., 2010, equation (12)). To model transitivity effects we include two terms. The first is an aggregate transitive ties term that aims to capture a tendency toward transitive closure consistent with local hierarchy. The second is an aggregate cyclical ties term to capture anti-hierarchical closure. The terms in the model are structurally largely consistent with the terms chosen in Snijders et al. (2010). A similar model was considered for the dissolution process. Specifics of these terms are given in the Appendix.

We fit the model using the conditional MLE procedure of Section 4.
Computationally this is implemented using a variant of the MCMC approach of
Hunter and Handcock (2006). To monitor the statistical properties of the
MCMC algorithm we use the procedures by Hunter, Goodreau, and Handcock
(2008a).



Table 2 reports the estimates for the model assuming homogeneity of
parameters over time. The outdegree popularity effect had a correlation of 
0.995 with the edges effect and was omitted from the model. 

As for the standard ERGM, the individual $\theta$ coefficients can be
interpreted as conditional log-odds ratios. There is also a relative risk
interpretation that is often simpler. For example, the exponential of the
primary school coefficient is the relative risk of formation or preservation
(depending on the phase) of friendship between two students from the same
primary school compared to two students from different primary schools with
the same values of the other covariates and structural effects. The
probabilities involved are conditional on these other covariates and
structural effects. The interpretation for non-binary and multiple covariates
is similar: $\exp(\theta\Delta)$ is the relative risk of friendship between
two students compared to two students with vector of covariates differing by
$\Delta$ (and with the same values of the other structural effects).

Table 2: MLE parameter estimates for the longitudinal friendship network

                           Formation              Dissolution
Parameter               est.    (s.e.)          est.    (s.e.)
Edges                 -3.336   (0.320)***     -1.132   (0.448)*
Homophily (girls)      0.480   (0.269)         0.122   (0.394)
Homophily (boys)       0.973   (0.355)**       1.168   (0.523)*
F->M heterophily      -0.358   (0.330)        -0.577   (0.609)
Primary school         0.650   (0.248)**       0.451   (0.291)
Reciprocity            1.384   (0.280)***      2.682   (0.523)***
Transitive ties        0.886   (0.247)***      1.121   (0.264)***
Cyclical ties         -0.389   (0.133)**      -1.016   (0.231)***

Significance levels: 0.05*, 0.01**, 0.001***
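
As a small worked illustration of the multiplicative reading described above, the following sketch exponentiates a few formation-phase estimates transcribed from Table 2. The dictionary and the helper name `multiplicative_effect` are hypothetical and introduced only for this example; the output is the factor that the text interprets as a relative risk, conditional on the other terms in the model.

```python
import math

# Formation-phase point estimates transcribed from Table 2 (est. column only).
formation = {
    "Homophily (girls)": 0.480,
    "Homophily (boys)": 0.973,
    "Primary school": 0.650,
    "Reciprocity": 1.384,
}


def multiplicative_effect(coef, delta=1.0):
    """Return exp(coef * delta), the factor the text reads as a relative risk.

    `delta` is the difference in the covariate (or change statistic) between
    the two dyads being compared; delta = 1 corresponds to a binary covariate.
    """
    return math.exp(coef * delta)


effects = {name: multiplicative_effect(coef) for name, coef in formation.items()}

for name, value in effects.items():
    print(f"{name}: exp(estimate) = {value:.2f}")
```

Under this reading, the primary school estimate of 0.650 corresponds to a factor of a little under 2 for tie formation, holding the other covariates and structural effects fixed.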

The standard errors of Table 2 are obtained from the information matrix
in the likelihood evaluated at the MLE, to which we have added the (small)
MCMC standard error obtained using the procedure given by Hunter, Handcock,
Butts, Goodreau, and Morris (2008b).



The networks at the earlier time points are strongly sexually segregated, 
and we see strong homophily by sex in the formation of ties. This effect is 
mildly stronger for boys than for girls. We do not see an overall disinclination 
for girls to nominate boys (relative to other combinations). In other words, 
the boys are about as likely to form friendships as the girls. As expected, we 
see a high degree of reciprocity in the formation of ties. There is a strong 
transitive closure effect, with a positive coefficient on transitive tie formation 
and a negative coefficient on cyclical tie formation. This suggests a strong 
hierarchical tendency in the formation of ties. We see that students who 
attended the same primary school are much more likely to form ties. 

These structural terms have less influence on the dissolution of ties. There
is some modest evidence that boy to boy ties are less likely to dissolve than
other mixtures of sexes. (Recall that parameters represent a measure of
persistence, so that negative parameters are associated with shorter
durations.) As expected, we see the dissolution of ties is strongly retarded
by the presence of a reciprocal tie. As in the formation process, there is a
strong transitive closure effect suggesting a strong hierarchical tendency in
the dissolution of ties. Once a hierarchical triad is formed it will tend to
endure longer. Students who attended the same primary school are not
significantly more likely to have persistent ties.

Table 3: Analysis of deviance for the longitudinal friendship network,
comparing time-homogeneous (hom.) and time-heterogeneous (het.)
parametrisations

                                       Formation                           Dissolution
                            residual      explained               residual     explained
Model                       dev. (d.f.)   dev. (d.f.)     AIC     dev. (d.f.)  dev. (d.f.)    AIC
Null                        1838 (1326)                  1838     459 (331)                   459
Edges (hom.)                 924 (1325)   915 ( 1)***     926     431 (330)    28 ( 1)***     433
Full (hom.)                  819 (1318)   104 ( 7)***     835     350 (323)    82 ( 7)***     366
Full (hom. except edges)     818 (1316)     2 ( 2)        838     344 (321)     6 ( 2)        364
Full (het.)                  795 (1302)    23 (14)        843     314 (307)    30 (14)**      362

Significance levels: 0.05*, 0.01**, 0.001***

As the data measure a social process that is developing in time, we do not 
need to assume that the process is in temporal equilibrium; thus we could 
estimate separate parameters for the change between each pair of successive 
time points. One such model specifies different overall rates of tie formation 
or dissolution at each time point but retains homogeneous parameters for the 
other terms. Another allows all the parameters to vary at each time point. 

Table 3 gives the analysis of deviance for formation and dissolution models
nested above and below those in Table 2. For the formation process we
see the full time-homogeneous model in Table 2 significantly improves on
the null and Erdős-Rényi model (Edges (hom.)). Specifying different overall
rates of tie formation at each time point does not significantly improve the
fit, nor does a full time-heterogeneous model with different structural
parameters at each time point. For the dissolution process, we again see the
full time-homogeneous model significantly improves on the null and
Erdős-Rényi model. However, there is some evidence that specifying
time-heterogeneous versions improves the fit. An inspection of the
time-heterogeneous models indicates that most of the improvement is due to
the increase in hierarchical tendency over time. Initially this transitive
closure does not retard tie dissolution, but it does over time.
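
The AIC columns of Table 3 can be cross-checked with simple arithmetic. The sketch below assumes that AIC is computed as the residual deviance plus twice the number of parameters, and that the number of parameters equals the null degrees of freedom minus the residual degrees of freedom; both are assumptions about how the table was produced, although the reported figures are consistent with them. The function and variable names are hypothetical.

```python
# Each entry: (residual deviance, residual d.f., reported AIC), from Table 3.
FORMATION_NULL_DF = 1326
DISSOLUTION_NULL_DF = 331

formation = {
    "Edges (hom.)": (924, 1325, 926),
    "Full (hom.)": (819, 1318, 835),
    "Full (hom. except edges)": (818, 1316, 838),
    "Full (het.)": (795, 1302, 843),
}
dissolution = {
    "Edges (hom.)": (431, 330, 433),
    "Full (hom.)": (350, 323, 366),
    "Full (hom. except edges)": (344, 321, 364),
    "Full (het.)": (314, 307, 362),
}


def aic(residual_deviance, residual_df, null_df):
    """AIC under the assumed convention: deviance + 2 * (number of parameters)."""
    n_params = null_df - residual_df
    return residual_deviance + 2 * n_params


def check(models, null_df):
    """Return {model: (recomputed AIC, reported AIC)} for one phase."""
    return {
        name: (aic(dev, df, null_df), reported)
        for name, (dev, df, reported) in models.items()
    }


formation_check = check(formation, FORMATION_NULL_DF)
dissolution_check = check(dissolution, DISSOLUTION_NULL_DF)

for phase, result in (("formation", formation_check), ("dissolution", dissolution_check)):
    for name, (recomputed, reported) in result.items():
        print(phase, name, recomputed, reported)
```

Because AIC penalizes each additional parameter by 2, the time-heterogeneous formation model (24 parameters) has a higher AIC than the time-homogeneous one (8 parameters) despite its lower residual deviance, which agrees with the conclusion drawn in the text.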

6 Discussion



This paper introduces a statistical model for networks that evolve over time. 
It builds on the foundations of exponential-family random graph models for 
cross-sectional networks and inherits the flexibility and interpretability of 
these models. In addition, it leverages the inferential and computational 
tools that have been developed for ERGMs over the last two decades. 

As we showed in Section 2, parameters used in models currently in use
directly affect both the incidence of ties (at a given time point) and the 
duration of ties (over time). STERGMs have one set of parameters control 
formation of new ties and another control dissolution (or non-dissolution) 
of extant ties. Such a separable parametrisation controls the attribution of 
incidence and duration and greatly improves the interpretability of the model 
parameters, all without sacrificing the ability to explicitly incorporate effects 
of specific features of past networks, if needed. 

It is important to emphasize that STERGMs jointly model the formation 
and dissolution of ties. While the two processes are modeled as conditionally 
independent within a time step, they are modeled as dependent over time. 
More importantly, they allow the structure of the incidence to be identified 
in the presence of the durational structure. 

In addition, the model has computational advantages. The likelihood
function can be decomposed and the components computed relatively easily.
All computations in this paper were completed using the ergm (Hunter et al.
2008b; Handcock, Hunter, Butts, Goodreau, Krivitsky, and Morris, 2012)
package from the statnet (Handcock, Hunter, Butts, Goodreau, and Morris
2008) suite of libraries for social network analysis in R (R Development
Core Team, 2009).

The model is directly applicable to both directed and undirected networks.
It can be easily tuned to applications by appropriate choices of terms
for both the formation and dissolution processes, as we show in Section 5.
Because it is based on ERGMs, it will share in advances made on those models
as well. The model is very useful for simulating realistic dynamic networks.
This is because of the sequential specification, the tractable parameters,
and the relatively light computational burden.

As illustrated in Section 5, missing data on the relational information can
be dealt with in likelihood-based inference using the approach of Handcock
and Gile (2010). If the longitudinal data are partially observed due to either
a sample design or a missing data process and are amenable to the model then
their method is directly applicable.

The assumption of within-step independence of formation and dissolution
is an important one, and its appropriateness depends on the substantive
setting and the basic nature of the process. Some settings do not allow a
separable formulation at all. For example, an affiliation network of players
to teams in some sports, with a realization observed during every game,
imposes a hard constraint that a player must belong to exactly one team
at a time, and no team can have more or fewer than a particular number
of players, so the basic unit of network change is teams trading players,
rather than a player joining or quitting a team. In settings that do allow
simpler atomic changes, separability may be a plausible approximation if
the amount of change between the discrete time steps is relatively small,
that is, if each time step represents a fairly small amount of time. As the
length of the time step increases, the separability approximation may become
less and less plausible. For example, a marriage network, even though it has
a hard constraint of each actor having at most one spouse at a time, could be
plausibly approximated in a separable framework (using, e.g.,
$\mathcal{Y}^+(y^{t-1}) = \{y \in 2^{\mathbb{Y}} : y \supseteq y^{t-1} \wedge \forall_{i \in N}\, |y_i| \le 1\}$)
if each discrete time step represented one month (since relatively few people
divorce and remarry in the same month) but not if it represented ten years.
More generally, the simpler the formation and dissolution processes are
within a time-step and the weaker the dependence between them, the more
plausible the assumption. (Of course, continuous-time Markov models, to which
these models asymptote, do not require an independence assumption at all.)
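
The constrained formation sample space used in the marriage example (formation networks that contain every tie of the previous network and give each actor at most one tie) can be sketched as a membership check. This is only an illustration: the function name, the representation of undirected ties as `frozenset` pairs, and the `max_degree` parameter are assumptions made for this example and are not part of the paper or of any package.

```python
from collections import Counter


def in_formation_sample_space(y_plus, y_prev, max_degree=1):
    """Check whether y_plus is an admissible formation network.

    Networks are sets of undirected ties, each tie a frozenset of two actors.
    Admissible means: y_plus contains every tie of y_prev (formation can only
    add ties), and no actor has more than `max_degree` ties in y_plus.
    """
    if not y_prev <= y_plus:
        return False
    degree = Counter(actor for tie in y_plus for actor in tie)
    return all(d <= max_degree for d in degree.values())


# Previous network: actors 1 and 2 are married.
y_prev = {frozenset({1, 2})}

# Adding a marriage between two single actors is admissible.
adds_new_couple = in_formation_sample_space(y_prev | {frozenset({3, 4})}, y_prev)

# Adding a second spouse for actor 2 violates the degree constraint.
adds_second_spouse = in_formation_sample_space(y_prev | {frozenset({2, 3})}, y_prev)

# Dropping the existing tie is not something the formation phase can do.
drops_existing_tie = in_formation_sample_space({frozenset({3, 4})}, y_prev)

print(adds_new_couple, adds_second_spouse, drops_existing_tie)
```

The second case shows why the time step matters: within a single step, an actor can gain a new spouse only if the old tie was already absent at the previous time point, so a divorce followed by remarriage within the same step cannot be represented.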

As with the data used in Section 5, restriction on the number of alters
reportable is a common feature of network surveys. Other examples of this
censoring include the Add Health friendship networks (Harris, Florey, Tabor,
Bearman, Jones, and Udry, 2003) and Sampson's monastery data (Sampson
1968). To the extent that these are features of the sampling design, they
should be reflected in the likelihood. Per Section 3.2.3, a STERGM can be
represented as a TERGM (1), which allows the sample space $\mathcal{Y}$ of
$Y^t$ to be constrained to reflect this design. Changing $\mathcal{Y}$ only
affects $c_{\eta,g}(\theta)$ in $l(\theta)$; the kernel of the model remains
separable. This situation is similar to that with censored data in survival
analysis, where the likelihood is altered to reflect the censoring while the
model, and its interpretation, is unchanged.

Since assuming separability between formation and dissolution grants
significant benefits to interpretability, it would be useful to be able to
test if separability may be assumed in a given network process. Some avenues
for such tests include comparing goodness-of-fit of a given model in modeling
a transition $y^0 \to y^2$ to its modeling a transition
$y^0 \to y^1 \to y^2$ (with homogeneous parameters). Or, if only one
transition is available, comparing a transition $y^0 \to y^1$ to a transition
$y^0 \to Y^{1/2} \to y^1$, with a latent intermediate network $Y^{1/2}$.
Development of such tests is beyond the scope of this work and is a subject
for future research.

The STERGM framework allows a number of extensions to the model. 
Over time, networks do not merely change ties: actors enter and leave the 
network, and actors' own attributes change. It is possible to incorporate 
the network size adjustment developed by Krivitsky et al. (2011) into these 
dynamic models. We have focused on longitudinal data. It is possible to
fit the model based on egocentrically sampled data when the data include
durational information on the relational ties (Krivitsky and Handcock, 2008;
Krivitsky, 2009).

A Separable TERGM Terms



In this appendix we derive and discuss some fundamental model terms that 
can be used in a STERGM. 

A.1 Edge Counts

A.1.1 Formation

Let $g^+(y^+, y^{t-1}) = |y^+|$. This is equivalent to
$g(y^t, y^{t-1}) = |y^t \cup y^{t-1}|$. If $y^{t-1}_{ij} = 1$, then
$y^t_{ij} \vee y^{t-1}_{ij} = 1$, so the state of $y^t_{ij}$ has no effect on
$g(y^t, y^{t-1})$, but if $y^{t-1}_{ij} = 0$, then
$y^t_{ij} \vee y^{t-1}_{ij} = y^t_{ij}$, and the change in $g(y^t, y^{t-1})$
is 1. This means that, in the absence of other formation terms, $\theta^+$
represents the log-odds of a given tie variable, that does not already have a
tie, gaining a tie. Then $\operatorname{logit}^{-1}(\theta^+)$ is the expected
fraction of tie variables empty at time $t-1$ gaining a tie at time $t$. In
the presence of other terms, these log-odds become conditional
log-odds-ratios.
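
The argument above about which dyads can change the formation edge count can be checked on a toy example. In this sketch, networks are plain Python sets of directed `(i, j)` pairs; the function names `formation_edges` and `toggle` and the example networks are hypothetical and chosen only for illustration.

```python
def formation_edges(y_t, y_prev):
    """Edge count of the formation network: |y_t union y_prev|."""
    return len(y_t | y_prev)


def toggle(y, dyad):
    """Return a copy of network y with the state of one dyad flipped."""
    return y ^ {dyad}


y_prev = {(1, 2), (2, 3)}  # network at time t-1
y_t = {(1, 2), (3, 1)}     # network at time t

base = formation_edges(y_t, y_prev)

# Dyad (2, 3) is tied at t-1: its state at time t cannot change the statistic.
change_if_previously_tied = formation_edges(toggle(y_t, (2, 3)), y_prev) - base

# Dyad (1, 3) is empty at t-1: adding it at time t raises the statistic by 1.
change_if_previously_empty = formation_edges(toggle(y_t, (1, 3)), y_prev) - base

print(base, change_if_previously_tied, change_if_previously_empty)
```

The zero change for the previously tied dyad is what makes the formation coefficient interpretable as acting only on dyads that were empty at time $t-1$.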



A.1.2 Dissolution

Let $g^-(y^-, y^{t-1}) = |y^-|$, or, equivalently,
$g(y^t, y^{t-1}) = |y^t \cap y^{t-1}|$. If $y^{t-1}_{ij} = 0$, then
$y^t_{ij} \wedge y^{t-1}_{ij} = 0$, so the state of $y^t_{ij}$ has no effect
on $g(y^t, y^{t-1})$, but if $y^{t-1}_{ij} = 1$, then
$y^t_{ij} \wedge y^{t-1}_{ij} = y^t_{ij}$, and the change in
$g(y^t, y^{t-1})$ is 1. Then, in the absence of other dissolution terms,
$\theta^-$ represents the log-odds of a given tie that exists at $t-1$
surviving to $t$, and $\operatorname{logit}^{-1}(\theta^-)$ is the expected
fraction of ties extant at time $t-1$ surviving to time $t$. Depending on the
problem, the interpretation of $-\theta^-$ might be more useful:
$\operatorname{logit}^{-1}(-\theta^-)$ is the expected fraction of extant
ties being dissolved, that is, the hazard.

The formation phase can only affect non-tied pairs of actors, so if the
dissolution phase statistics have dyadic independence, the formation process
has no effect on duration distribution: in the absence of other dissolution
terms, the duration distribution of a tie is geometric (with support
$\mathbb{N}$) with expected value (Casella and Berger, 2002, pp. 621-622)

$$\left(\operatorname{logit}^{-1}(-\theta^-)\right)^{-1} = 1 + \exp(\theta^-).$$
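
The relationship between the dissolution edges coefficient, the per-step hazard, and the expected tie duration can be verified numerically. The following is a minimal sketch for the edges-only dissolution model described above; the coefficient value is hypothetical, and the function names are introduced only for this example.

```python
import math


def inv_logit(x):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))


def dissolution_hazard(theta_minus):
    """Per-step probability that an extant tie dissolves: logit^{-1}(-theta)."""
    return inv_logit(-theta_minus)


def expected_duration(theta_minus):
    """Mean tie duration in time steps under the edges-only model."""
    return 1.0 + math.exp(theta_minus)


theta_minus = 2.0  # hypothetical dissolution edges coefficient
hazard = dissolution_hazard(theta_minus)

# Mean of a geometric distribution on {1, 2, ...} with success probability
# `hazard`, computed directly from its probability mass function.
series_mean = sum(k * hazard * (1.0 - hazard) ** (k - 1) for k in range(1, 5000))

print(hazard, expected_duration(theta_minus), series_mean)
```

A larger $\theta^-$ lowers the hazard and lengthens the expected duration, which matches the reading of dissolution coefficients as measures of persistence.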

A.2 Selective Mixing

Selective mixing in the formation model can be represented by a vector of
statistics $g^+_{k_1,k_2}(y^+, y^{t-1}) = |y^+_{P_{k_1},P_{k_2}}|$, where
$y_{P_{k_1},P_{k_2}}$ denotes the ties from actors in group $P_{k_1}$ to
actors in group $P_{k_2}$. However, in the context of a STERGM, they have a
direction.

A.2.1 Formation

Let $g^+_{k_1,k_2}(y^+, y^{t-1}) = |y^+_{P_{k_1},P_{k_2}}|$ (equivalently,
$g_{k_1,k_2}(y^t, y^{t-1}) = |y^t_{P_{k_1},P_{k_2}} \cup y^{t-1}_{P_{k_1},P_{k_2}}|$).
The change in its value due to adding a tie $(i,j)$ (absent in $y^{t-1}$)
is $1_{i \in P_{k_1} \wedge j \in P_{k_2}}$, so $\theta^+_{k_1,k_2}$ is the
conditional log-odds-ratio due to the effect of $i$ belonging to group $k_1$
and $j$ belonging to group $k_2$ of a dyad that does not already have a tie,
gaining a tie. If the formation phase has no other terms, then the odds that
$Y^t_{ij} = 1$ given that $Y^{t-1}_{ij} = 0$ are

$$\operatorname{Odds}(Y^t_{ij} = 1 \mid Y^{t-1}_{ij} = 0, i \in P_{k_1} \wedge j \in P_{k_2}; \theta^+_{k_1,k_2}) = \exp(\theta^+_{k_1,k_2}).$$

A.2.2 Dissolution

Similar to the formation case, selective mixing can be represented by a vector
of statistics $g^-_{k_1,k_2}(y^-, y^{t-1}) = |y^-_{P_{k_1},P_{k_2}}|$. Then,
$g_{k_1,k_2}(y^t, y^{t-1}) = |y^t_{P_{k_1},P_{k_2}} \cap y^{t-1}_{P_{k_1},P_{k_2}}|$,
and $\theta^-_{k_1,k_2}$ is the conditional log-odds-ratio due to the effect
of $i$ belonging to group $k_1$ and $j$ belonging to group $k_2$ of an extant
tie being preserved until the next time step.
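
The mixing statistics for the two phases can be illustrated on a toy directed network. In this sketch, the group labels, the example networks, and the function name `mixing_count` are all hypothetical; the formation network is taken as the union and the dissolution network as the intersection of the networks at times $t$ and $t-1$, as in the equivalent TERGM forms given above.

```python
def mixing_count(y, group, k1, k2):
    """Number of directed ties (i, j) in y with i in group k1 and j in group k2."""
    return sum(1 for i, j in y if group[i] == k1 and group[j] == k2)


# Hypothetical actor attribute: two girls (F) and two boys (M).
group = {1: "F", 2: "F", 3: "M", 4: "M"}

y_prev = {(1, 2), (3, 4)}         # network at time t-1
y_t = {(1, 2), (1, 3), (4, 3)}    # network at time t

y_plus = y_t | y_prev    # formation network
y_minus = y_t & y_prev   # dissolution network

formation_stats = {
    (k1, k2): mixing_count(y_plus, group, k1, k2)
    for k1 in ("F", "M") for k2 in ("F", "M")
}
dissolution_stats = {
    (k1, k2): mixing_count(y_minus, group, k1, k2)
    for k1 in ("F", "M") for k2 in ("F", "M")
}

print(formation_stats)
print(dissolution_stats)
```

In this example the boy-to-boy tie (3, 4) dissolves, so it still counts toward the formation statistic (it is in the union) but not toward the dissolution statistic.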



A.3 Degree Distribution

Unlike the first two examples, degree distribution statistics (counts of
actors with a particular degree or range of degrees) introduce dyadic
dependence into the model. As with many other such terms, closed forms for
many quantities of interest are not available, and conditional log-odds are
not as instructive, but the general results for implicitly dynamic terms from
Section 3.3 provide a useful heuristic, with the caveats discussed in that
section.

In practice, these terms are often used in conjunction with other terms,
so we only discuss their effect on the formation and dissolution probabilities
conditional on other terms, that is, their effect over and above other terms,
with those terms' coefficients held fixed.

A.3.1 Formation

Let $y_i$ be the set of neighbors to whom $i$ has ties in $y$. A formation
degree count term has the form
$g^+_d(y^+, y^{t-1}) = \sum_{i \in N} 1_{|y^+_i| = d}$: the number of actors
$i$ in $y^+$ whose degree is $d$. The corresponding TERGM statistic is
$g_d(y^t, y^{t-1}) = \sum_{i \in N} 1_{|y^t_i \cup y^{t-1}_i| = d}$. We
discuss the cases of $d = 0$ and $d = 1$, with the cases for $d > 1$ being
similar to the $d = 1$ case.

d = 0  By increasing the weight of those formation networks that have fewer
       isolates, a negative coefficient on this term increases the chances of
       a given actor gaining its first tie within a given time step.
       Conversely, a positive coefficient reduces the chances of an actor
       gaining its first tie. Because the term does not distinguish between
       different nonzero degrees, it mainly affects transitions from isolation
       to degree 1, not affecting further tie formation on that actor
       positively or negatively.

d = 1  Unlike the statistic for d = 0, which can only be decreased by adding
       ties, the statistic for d = 1 can be both increased and decreased (by
       making isolates into actors with degree 1 and by making actors with
       degree 1 into actors with degree 2 and higher, respectively). Thus,
       the effect of this term is two-sided: with a positive coefficient, it
       both increases the chances of an actor gaining its first tie and
       reduces the chances of an actor gaining its second tie, while having
       relatively little effect on an actor with two ties gaining a third tie.
       A negative coefficient reduces the chances of an actor gaining its
       first tie, but if an actor already has one tie, it increases the
       chances that the actor gains a second tie.
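
The one-sided behaviour of the d = 0 statistic and the two-sided behaviour of the d = 1 statistic can be seen by computing the change in each statistic when a tie is added to a formation network. This is a minimal sketch that treats degree as out-degree in a directed network, following the definition of $y_i$ as the neighbors to whom $i$ has ties; the function names and the example network are hypothetical.

```python
def degree_count(y, actors, d):
    """Number of actors whose out-degree in the directed network y equals d."""
    out_degree = {i: 0 for i in actors}
    for i, _ in y:
        out_degree[i] += 1
    return sum(1 for i in actors if out_degree[i] == d)


def change_on_add(y, actors, tie, d):
    """Change in the degree-d count caused by adding `tie` to y."""
    return degree_count(y | {tie}, actors, d) - degree_count(y, actors, d)


actors = [1, 2, 3]
y_plus = {(2, 3)}  # actor 2 has out-degree 1; actors 1 and 3 have out-degree 0

# Actor 1 gains its first tie: one fewer isolate, one more actor of degree 1.
first_tie_d0 = change_on_add(y_plus, actors, (1, 2), d=0)
first_tie_d1 = change_on_add(y_plus, actors, (1, 2), d=1)

# Actor 2 gains its second tie: isolate count unchanged, degree-1 count drops.
second_tie_d0 = change_on_add(y_plus, actors, (2, 1), d=0)
second_tie_d1 = change_on_add(y_plus, actors, (2, 1), d=1)

print(first_tie_d0, first_tie_d1, second_tie_d0, second_tie_d1)
```

With a positive coefficient on the d = 1 term, the first addition is favoured (change of +1) and the second is penalized (change of -1), which is the two-sided effect described in the text.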

A.3.2 Dissolution

The analogous term in the dissolution model is
$g^-_d(y^-, y^{t-1}) = \sum_{i \in N} 1_{|y^-_i| = d}$: same as formation,
but applied to $y^-$, and
$g_d(y^t, y^{t-1}) = \sum_{i \in N} 1_{|y^t_i \cap y^{t-1}_i| = d}$.

d = 0  A negative coefficient on this term in the dissolution phase increases
       the weight of dissolution networks that have fewer isolates, and thus
       reduces the chances of a given actor losing its only tie, while a
       positive coefficient increases the chances of an actor losing its only
       tie. It may also have a modest effect on actors with more than one tie,
       since there is a positive probability of an actor losing more than one
       tie in the same time step.

d = 1  As in the case of formation, the effect of this term is two-sided:
       with a positive coefficient (to preserve or create networks with more
       "monogamous" ties) the chances of an actor losing its only tie decrease
       while the chances of an actor losing its second tie increase. (If an
       actor has 3 or more ties, the effect is weaker.)

       A negative coefficient on this term both increases the chances that an
       actor's last tie will be dissolved and reduces the chances that an
       actor with more than one tie has any ties dissolve.

A.4 Other Standard Statistics

Most statistics used in standard ERGM can be used in STERGM as implicitly
dynamic statistics. For example, standard formation statistics are

Reciprocity: $\sum_{(i,j) \in \mathbb{Y},\, i<j} y^+_{ij} y^+_{ji}$

Transitive ties: $\sum_{(i,j) \in \mathbb{Y}} y^+_{ij} \max_{k \in N} (y^+_{ik} y^+_{kj})$

Cyclical ties: $\sum_{(i,j) \in \mathbb{Y}} y^+_{ij} \max_{k \in N} (y^+_{ki} y^+_{jk})$

The corresponding dissolution statistics have the same form, with $y^+$
replaced by $y^-$.
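
The three statistics can be computed directly from their definitions: reciprocity counts mutual pairs, transitive ties counts ties $i \to j$ that close at least one two-path $i \to k \to j$, and cyclical ties counts ties $i \to j$ that close at least one two-path $j \to k \to i$. The sketch below is a plain-Python illustration on a hypothetical three-actor network, not the implementation used in the ergm package; passing a formation network gives the formation statistic, and passing a dissolution network gives the dissolution statistic.

```python
def reciprocity(y):
    """Number of mutual pairs: dyads i < j with both (i, j) and (j, i) in y."""
    return sum(1 for (i, j) in y if i < j and (j, i) in y)


def transitive_ties(y, actors):
    """Number of ties (i, j) with at least one k such that i -> k -> j."""
    return sum(
        1 for (i, j) in y
        if any((i, k) in y and (k, j) in y for k in actors)
    )


def cyclical_ties(y, actors):
    """Number of ties (i, j) with at least one k such that j -> k -> i."""
    return sum(
        1 for (i, j) in y
        if any((j, k) in y and (k, i) in y for k in actors)
    )


actors = [1, 2, 3]
# Hypothetical directed network (for example a formation network y+).
y = {(1, 2), (2, 3), (1, 3), (3, 1)}

n_mutual = reciprocity(y)
n_transitive = transitive_ties(y, actors)
n_cyclical = cyclical_ties(y, actors)

print(n_mutual, n_transitive, n_cyclical)
```

Because each tie contributes at most 1 regardless of how many two-paths it closes (the maximum over $k$), these statistics count ties rather than triangles.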